AP® STATISTICS 
2007 SCORING GUIDELINES (Form B) 


Question 5 


Intent of Question 


The primary intent of this question is to assess a student's ability to make an inference about the difference in two 
population means using a two-sample t-test, including the identification of the null and alternative hypotheses and 
good communication of test results and conclusions. 


Solution 
Part 1: States a correct pair of hypotheses 


$H_0: \mu_S - \mu_N = 0$ versus $H_a: \mu_S - \mu_N < 0$
OR
$H_0: \mu_N - \mu_S = 0$ versus $H_a: \mu_N - \mu_S > 0$
OR
$H_0: \mu_S = \mu_N$ versus $H_a: \mu_S < \mu_N$

where $\mu_S$ = mean decrease in cholesterol for standard drug
and $\mu_N$ = mean decrease in cholesterol for new drug.


Part 2: Identifies a correct test (by name or by formula) and checks appropriate assumptions. 


Two-sample t-test (or z-test): $t = \dfrac{\bar{x}_S - \bar{x}_N}{\sqrt{\dfrac{s_S^2}{n_S} + \dfrac{s_N^2}{n_N}}}$

Assumptions: 


1. random assignment of subjects to treatments; 
2. normal population distributions or large samples. 


Both sample sizes were large (50), and there was random assignment of subjects to treatments. 
NOTE: A two-sample z-test is acceptable as long as the large sample sizes are noted. A pooled t-test is also 
acceptable, but the student must also state and comment on the plausibility of the equal population variances 
assumption. 


Part 3: Correct mechanics, including the value of the test statistic, df (stated or implied by calculator work), and 
p-value (or rejection region). 


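$t = \dfrac{\bar{x}_S - \bar{x}_N}{\sqrt{\dfrac{s_S^2}{n_S} + \dfrac{s_N^2}{n_N}}} = \dfrac{10 - 18}{\sqrt{\dfrac{8^2}{50} + \dfrac{12^2}{50}}} = -3.92$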
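
As a quick check of the arithmetic in Part 3, the following Python sketch recomputes the unpooled two-sample t statistic from the summary statistics given in the question. The function name `two_sample_t` and its argument names are illustrative choices, not part of the scoring guidelines.

```python
import math

def two_sample_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Unpooled two-sample t statistic for the difference (mean_a - mean_b)."""
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b) / standard_error

# Standard drug: mean decrease 10, SD 8, n = 50.
# New drug:      mean decrease 18, SD 12, n = 50.
t_stat = two_sample_t(10, 8, 50, 18, 12, 50)
print(round(t_stat, 2))  # -3.92
```

Reversing the order of the two groups only flips the sign of the statistic, which matches the second acceptable pair of hypotheses in Part 1.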

df = 85, p-value = 0.000088, or from table p-value < 0.001 
OR 
df = 49, p-value = 0.00014. 

For two-sample z-test, $z = -3.92$, p-value = 0.000044. 
For pooled t-test, $t = -3.92$, df = 98, p-value = 0.000081. 
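
The degrees of freedom and p-values listed above can be reproduced approximately with the Python standard library alone. This sketch assumes the Welch-Satterthwaite formula for the unpooled df, min(n1, n2) - 1 for the conservative df, and n1 + n2 - 2 for the pooled df. The helper names (`welch_df`, `t_lower_tail`, `z_lower_tail`) are hypothetical, and the t tail area comes from Simpson's-rule integration of the density rather than from any particular calculator routine, so the last digits may differ slightly from the values above.

```python
import math

def welch_df(sd_a, n_a, sd_b, n_b):
    """Welch-Satterthwaite approximate degrees of freedom."""
    var_a, var_b = sd_a**2 / n_a, sd_b**2 / n_b
    return (var_a + var_b) ** 2 / (var_a**2 / (n_a - 1) + var_b**2 / (n_b - 1))

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_lower_tail(t, df, steps=2000):
    """P(T <= t), via Simpson's rule on the density between t and 0."""
    if t > 0:
        return 1.0 - t_lower_tail(-t, df, steps)
    width = -t / steps  # steps must be even for Simpson's rule
    total = t_pdf(t, df) + t_pdf(0.0, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * width, df)
    return 0.5 - total * width / 3

def z_lower_tail(z):
    """P(Z <= z) for a standard normal Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

t_stat = (10 - 18) / math.sqrt(8**2 / 50 + 12**2 / 50)
candidates = [("unpooled (Welch)", welch_df(8, 50, 12, 50)),
              ("conservative", 49),
              ("pooled", 98)]
for label, df in candidates:
    print(f"{label}: df = {df:.1f}, p-value = {t_lower_tail(t_stat, df):.6f}")
print(f"z-test: p-value = {z_lower_tail(t_stat):.6f}")
```

Because both groups have n = 50, the pooled and unpooled standard errors coincide, which is why the pooled t-test reports the same t = -3.92 and differs only in its degrees of freedom.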


Rejection regions: 

$\alpha = 0.10$: $t < -1.303$ (df = 40), $t < -1.292$ (df = 80), $t < -1.290$ (df = 100) OR $z < -1.28$ 
$\alpha = 0.05$: $t < -1.684$ (df = 40), $t < -1.664$ (df = 80), $t < -1.660$ (df = 100) OR $z < -1.645$ 
$\alpha = 0.01$: $t < -2.423$ (df = 40), $t < -2.374$ (df = 80), $t < -2.364$ (df = 100) OR $z < -2.33$ 
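
A table-based decision can be sketched as a simple lookup. The critical values below are copied from the rejection regions above. Rounding df down to the nearest tabled row (a common conservative convention) and the names `CRITICAL_T` and `rejects_h0` are assumptions made for illustration.

```python
# Lower-tail critical values |t*| by significance level and tabled df.
CRITICAL_T = {
    0.10: {40: 1.303, 80: 1.292, 100: 1.290},
    0.05: {40: 1.684, 80: 1.664, 100: 1.660},
    0.01: {40: 2.423, 80: 2.374, 100: 2.364},
}

def rejects_h0(t_stat, df, alpha):
    """Lower-tailed test: reject H0 when t_stat falls below -t*."""
    rows = CRITICAL_T[alpha]
    usable = [row_df for row_df in rows if row_df <= df]
    if not usable:
        raise ValueError("df is below the smallest tabled row")
    critical = rows[max(usable)]  # round df down to a tabled row
    return t_stat < -critical

for alpha in (0.10, 0.05, 0.01):
    print(alpha, rejects_h0(-3.92, 85, alpha))  # True at every level
```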


Part 4: Stating a correct conclusion in the context of the problem, using the result of the statistical test. 


Because the p-value < selected $\alpha$ (or because the p-value is so small), reject $H_0$. There is convincing 
evidence that the mean cholesterol reduction is greater for the new drug. 


Scoring 
Each part is scored as either essentially correct (E), partially correct (P), or incorrect (I). 
Part 1 is essentially correct (E) if the response: 
1. includes the correct pair of hypotheses; 
2. defines the parameters in the hypotheses in the context of the problem. 
Part 1 is partially correct (P) if the hypotheses are stated correctly, but notation is not defined. 
Part 2 is essentially correct (E) if the response: 
1. identifies the correct test by name or formula; 
2. checks appropriate assumptions (including equal variance if pooled t-test is used). 
Part 2 is partially correct (P) if it includes only one of the two elements above. 
Part 3 is essentially correct (E) if the response includes: 
1. correct mechanics, including the value of the test statistic; 
2. df and p-value or rejection region consistent with the hypotheses in part 1. 
Part 3 is partially correct (P) if it includes only one of the two elements above. 
Part 4 is essentially correct (E) if the response includes: 
1. a conclusion in context consistent with the hypotheses in part 1; 
2. linkage between the results of the test in part 3 and the conclusion, and this is communicated well. 


Part 4 is partially correct (P) if it includes only one of the two elements above. 

• If both an $\alpha$ and a p-value are given, the linkage is implied. If no $\alpha$ is given, the solution must be explicit 
  about the linkage by giving a correct interpretation of the p-value or explaining how the conclusion 
  follows from the p-value. 

• If the p-value in part 3 is incorrect but the conclusion is consistent with the computed p-value, part 4 can 
  be considered essentially correct (E). 


4  Complete Response 

   All four parts essentially correct 

3  Substantial Response 

   Three parts essentially correct and no parts partially correct 
   OR 
   Two parts essentially correct and two parts partially correct 

2  Developing Response 

   Two parts essentially correct and no parts partially correct 
   OR 
   One part essentially correct and two parts partially correct 
   OR 
   Four parts partially correct 

1  Minimal Response 

   One part essentially correct and no parts partially correct 
   OR 
   No parts essentially correct and two parts partially correct 


If a response is between two scores (for example, 2½ points), use a holistic approach to determine whether 
to score up or down depending on the strength of the response and communication. 
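
The score combinations listed above are all consistent with counting each essentially correct part as 1 point and each partially correct part as half a point. That weighting is an inference from the listed combinations and from the mention of responses falling between two scores; the guidelines do not state it outright. Under that assumption, a minimal sketch of the tally (the function name `tally_score` is hypothetical):

```python
from fractions import Fraction

def tally_score(part_grades):
    """Tally four part grades ('E', 'P', or 'I') for Parts 1-4.

    Returns (total, needs_holistic_call); the flag is True when the total
    falls between two whole scores and a reader must score up or down.
    """
    points = {"E": Fraction(1), "P": Fraction(1, 2), "I": Fraction(0)}
    if len(part_grades) != 4:
        raise ValueError("expected exactly four part grades")
    total = sum(points[grade] for grade in part_grades)
    return total, total.denominator != 1

for grades in ("EEEE", "EEPP", "EPPI", "EEPI"):
    total, holistic = tally_score(grades)
    print(grades, float(total), "holistic call" if holistic else "")
```

The tally only flags in-between totals; deciding whether to score up or down remains a judgment about the strength of the response and its communication.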
5A 


5. A serum cholesterol level above 250 milligrams per deciliter (mg/dl) of blood is a risk factor for cardiovascular 
disease in humans. At a medical center in St. Louis, a study to test the effectiveness of a new cholesterol- 
lowering drug was conducted. One hundred people with cholesterol levels between 250 mg/dl and 300 mg/dl 
were available for this study. Fifty people were assigned at random to each of two treatment groups. One group 
received the standard cholesterol-lowering medication and the other group received the new drug. After taking 
the drug for three weeks, the 50 subjects who received the standard treatment had a mean decrease in cholesterol 
level of 10 mg/dl with a standard deviation of 8 mg/dl, and the 50 subjects who received the new drug had a 
mean decrease of 18 mg/dl with a standard deviation of 12 mg/dl. 


Does the new drug appear to be more effective than the standard treatment in lowering mean cholesterol level? 
Give appropriate statistical evidence to support your conclusion. 


[Handwritten student response; only partly legible in the scan]

$H_0$: the new drug's effectiveness equals that of the old drug: $\mu_o = \mu_n$
$H_a$: the new drug is more effective than the old drug: $\mu_o < \mu_n$

$\mu_o$ = mean decrease in cholesterol level for old drug
$\mu_n$ = mean decrease in cholesterol level for new drug

[Summary statistics, test statistic formula, computed value, and degrees of freedom: illegible]

Conditions: large samples, so the Central Limit Theorem is satisfied. [Remaining condition checks illegible]

p-value < .0005, which means that the probability is significantly small. Reject $H_0$, assume that $H_a$ is true.
The new drug seems to be more effective at lowering mean cholesterol level.
5B 


[Handwritten student response; only partly legible in the scan]

$H_0$: [symbols illegible] The standard cholesterol-lowering medication and the new drug [remainder illegible].
$H_a$: [symbols illegible] The standard cholesterol-lowering medication is less effective than the new drug.

Two-Sample T-Test

Conditions: SRS; 10 × n < population; Normal

t = (10 - 18) / [denominator illegible] = -3.92

With a p-value of .000088, we can assume that our results were significant and not due to chance. The new drug
is more effective than the standard cholesterol-lowering medication.


5C 


[Handwritten student response; largely illegible in the scan]

N = 100 (number of people available)
n = 50 (the 2 different treatment groups)

Let [symbol illegible] be the group that takes the standard treatment and [symbol illegible] the group that takes the new drug.

[Remaining work illegible]
©2007 The College Board. All rights reserved. 
Visit apcentral.collegeboard.com (for AP professionals) and www.collegeboard.com/apstudents (for students and parents). 


AP® STATISTICS 
2007 SCORING COMMENTARY (Form B) 


Question 5 


Sample: 5A 
Score: 4 


This is a complete response. It clearly identifies the appropriate null and alternative hypotheses in the context of 
comparing mean cholesterol reduction provided by two drugs. A two-sample t-test is identified by presenting the 
formula for the test with the correct degrees of freedom. This response uses the moderately large sample sizes of 
50 in each treatment group to justify use of the t-test, although use of the test could be justified by the act of 
randomly assigning treatments to subjects. There is no information in the statement of the problem about whether 
or not the subjects were sampled from some larger population. The value of the test statistic is correctly 
computed, and an upper bound for the p-value is obtained up to the accuracy allowed by the t-table provided with 
the exam. The conclusion that the new treatment provides a greater mean decrease in cholesterol level than the old 
treatment is well supported by indicating that the p-value is sufficiently small. 


Sample: 5B 
Score: 3 


This is a substantial response. It identifies the appropriate null and alternative hypotheses using both words and 
symbols. A two-sample t-test is indicated, but the justification for using a two-sample t-test is incomplete. The 
problem provides no information about normal distributions or sampling the subjects from any population as 
indicated in the condition checks provided in the response. The random assignment of subjects to treatment 
groups and sufficiently large numbers of subjects should have been used to justify the test. The value of the 
t-statistic and corresponding p-value are correctly computed. The conclusion that the new treatment provides a 
greater mean decrease in cholesterol level than the old treatment is not supported by comparing the p-value to a 
level of significance. 


Sample: 5C 
Score: 2 


This is a developing response that does not correctly perform the t-test. It identifies the appropriate null and 
alternative hypotheses and identifies notation corresponding to the two treatment groups. There is some indication 
that sample sizes larger than 30 are needed, but this is not explicitly linked to the use of the two-sample t-test. No 
testing procedure is explicitly identified. A formula for the standard error of the difference between the two 
sample means is given, and it is incorrectly used as a test statistic. This leads to an incorrect p-value. The 
conclusion is consistent with the computed p-value and it is given in context, but the level of communication is 
not as good as in the previous two responses. 
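
To make the distinction drawn for Sample 5C concrete, this short sketch (variable names are illustrative) computes the standard error of the difference in sample means and, separately, the t statistic that should be formed from it. It uses only the summary statistics given in the question; the value the student actually reported is not reproduced here.

```python
import math

# Standard error of the difference in sample means (not a test statistic).
standard_error = math.sqrt(8**2 / 50 + 12**2 / 50)

# The test statistic divides the observed difference by that standard error.
t_stat = (10 - 18) / standard_error

print(round(standard_error, 2))  # 2.04
print(round(t_stat, 2))          # -3.92
```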